The aim of the present paper is to give an axiomatic characterization of quantum relative entropy utilizing the
resource conversion scenario. We consider two sets of axioms: non-asymptotic and asymptotic. In the former
setting, we prove that the upper bound and the lower bound of $D^Q(\rho\|\sigma)$ are
$D^R(\rho\|\sigma) := \mathrm{tr}\,\rho\ln\sqrt{\rho}\,\sigma^{-1}\sqrt{\rho}$ and
$D(\rho\|\sigma) := \mathrm{tr}\,\rho(\ln\rho-\ln\sigma)$, respectively. In the latter setting, we prove uniqueness of quantum relative
entropy, that is, $D^Q(\rho\|\sigma)$ should equal a constant multiple of $D(\rho\|\sigma)$. In the analysis, we define and use
the reverse test and the asymptotic reverse test, which are natural inverses of hypothesis tests.


1 Introduction 

Many problems in quantum/classical information theory can be viewed as conversion between given resources
and 'standard' resources, and this viewpoint has turned out to be very fruitful. This manuscript exploits
this scenario in the asymptotic theory of quantum estimation (with some comments on classical estimation
theory). The resource conversion scenario was first explored in the axiomatic theory of entanglement measures. The
optimal asymptotic conversion ratio from maximally entangled states (the 'standard' resource) to a given state is
called the entanglement cost, while the optimal ratio for the inverse conversion is called the distillable entanglement. It
has been shown that every quantity which satisfies a set of reasonable axioms takes a value between these two
quantities. Similar arguments have been applied to classical/quantum channels, and so on.

The aim of the present paper is to give an axiomatic characterization of quantum relative entropy utilizing the
resource conversion scenario. We consider two sets of axioms: non-asymptotic and asymptotic. In both cases,
we require that a quantum relative entropy $D^Q(\rho\|\sigma)$ is monotone decreasing under the application of any CPTP map.
In addition, in the former setting, we assume that the quantum relative entropy coincides with its classical counterpart
for probability distributions $\{p,q\}$: $D^Q(p\|q) = D(p\|q)$. Then we can prove that the upper bound and the
lower bound of $D^Q(\rho\|\sigma)$ are

\[ D^R(\rho\|\sigma) := \mathrm{tr}\,\rho\ln\sqrt{\rho}\,\sigma^{-1}\sqrt{\rho} \]

and

\[ D(\rho\|\sigma) := \mathrm{tr}\,\rho(\ln\rho-\ln\sigma), \]

respectively. In the latter setting, instead, $D^Q$ is supposed to satisfy some asymptotic properties, namely
weak additivity and lower asymptotic continuity, which will be defined later. Under these assumptions, we prove
uniqueness of quantum relative entropy, that is, $D^Q(\rho\|\sigma)$ should equal a constant multiple of $D(\rho\|\sigma)$.

In the analysis, the newly defined reverse test and asymptotic reverse test play a key role. The former is a
conversion from a pair $\{p,q\}$ of probability distributions to a pair $\{\rho,\sigma\}$ of quantum states, and the latter
is an approximate conversion from a pair $\{p^n,q^n\}$ of probability distributions over the binary set $\{0,1\}$ to a
pair $\{\rho^n,\sigma^n\}$ of quantum states. They are natural inverses of the optimal measurement for a hypothesis test
and of the hypothesis test of Neyman-Pearson type, respectively.

In the course of analyzing the reverse test, we show an operational meaning of the RLD Fisher information. Also, we
prove joint convexity of $D^R(\rho\|\sigma)$.

2 Main results

In the paper, the totality of density operators in the Hilbert space $\mathcal{H}$ is denoted by $\mathcal{S}(\mathcal{H})$, and the totality
of rank-$r$ elements is denoted by $\mathcal{S}_r(\mathcal{H})$. Unless otherwise mentioned, we suppose $d := \dim\mathcal{H} < \infty$. We consider
the following conditions.

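(M) (Monotonicity) For any CPTP map $\Lambda$,

\[ D^Q(\rho\|\sigma) \ge D^Q(\Lambda(\rho)\|\Lambda(\sigma)). \]

(N) (Normalization) For any probability distributions $\{p,q\}$,

\[ D^Q(p\|q) = D(p\|q) := \sum_x p(x)\,(\ln p(x)-\ln q(x)). \]

(A) (Weak additivity)

\[ D^Q(\rho^{\otimes n}\|\sigma^{\otimes n}) = n\,D^Q(\rho\|\sigma). \]

(C) (Lower asymptotic continuity)

\[ \lim_{n\to\infty}\|\rho^n-\rho^{\otimes n}\|_1 = 0 \implies \lim_{n\to\infty}\frac1n\left\{D^Q(\rho^n\|\sigma^{\otimes n}) - D^Q(\rho^{\otimes n}\|\sigma^{\otimes n})\right\} \ge 0. \qquad (1) \]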

Define

\[ D(\rho\|\sigma) := \mathrm{tr}\,\rho(\ln\rho-\ln\sigma), \]
\[ D^R(\rho\|\sigma) := \mathrm{tr}\,\rho\ln\sqrt{\rho}\,\sigma^{-1}\sqrt{\rho}, \]

and denote by $M(\rho)$ the probability distribution of the data obtained by applying the measurement $M$ to $\rho$.

Theorem 2.1 If (M) and (N) are satisfied,

\[ \max_M D(M(\rho)\|M(\sigma)) \le D^Q(\rho\|\sigma) \le D^R(\rho\|\sigma). \]

Theorem 2.2 If (M), (N) and (A) are satisfied,

\[ D(\rho\|\sigma) \le D^Q(\rho\|\sigma) \le D^R(\rho\|\sigma). \]

Theorem 2.3 If (M), (A), and (C) are satisfied,

\[ D^Q(\rho\|\sigma) = \mathrm{const.}\times D(\rho\|\sigma). \]
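The ordering $D(\rho\|\sigma) \le D^R(\rho\|\sigma)$ implied by Theorem 2.2 can be spot-checked numerically. The following Python sketch (NumPy only) evaluates both quantities from the definitions above for one pair of full-rank states. The helper names (`random_state`, `herm_fun`, `rel_entropy`, `rel_entropy_R`) and the way the test states are sampled are illustrative assumptions, not part of the paper; the check is a sanity test, not a proof.

```python
import numpy as np

def random_state(d, rng):
    """Random full-rank density matrix (illustrative sampling, not from the paper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T + 0.1 * np.eye(d)   # shift keeps eigenvalues away from 0
    return rho / np.trace(rho).real

def herm_fun(a, f):
    """Apply a scalar function f to a Hermitian matrix via its eigen-decomposition."""
    w, v = np.linalg.eigh(a)
    return (v * f(w)) @ v.conj().T

def rel_entropy(rho, sigma):
    """D(rho||sigma) = tr rho (ln rho - ln sigma), full-rank inputs assumed."""
    return np.trace(rho @ (herm_fun(rho, np.log) - herm_fun(sigma, np.log))).real

def rel_entropy_R(rho, sigma):
    """D^R(rho||sigma) = tr rho ln( sqrt(rho) sigma^{-1} sqrt(rho) )."""
    s = herm_fun(rho, np.sqrt)
    m = s @ np.linalg.inv(sigma) @ s
    m = (m + m.conj().T) / 2                 # remove rounding asymmetry before eigh
    return np.trace(rho @ herm_fun(m, np.log)).real

rng = np.random.default_rng(0)
rho, sigma = random_state(3, rng), random_state(3, rng)
d_umegaki = rel_entropy(rho, sigma)
d_r = rel_entropy_R(rho, sigma)
```

For commuting (for example, diagonal) states the two quantities coincide, in line with the normalization condition (N).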

2.1 Proof of Main theorems 

Below, a reverse test of a pair of states $\{\rho,\sigma\}$ means the triplet $(\Phi,\{p,q\})$ of a CPTP map $\Phi$ and probability
distributions $p$, $q$ with

\[ \Phi(p) = \rho, \quad \Phi(q) = \sigma. \]

We use the following theorems to prove the main theorems:

Theorem 2.4

\[ \min D(p\|q) = D^R(\rho\|\sigma), \]

where the minimization is taken over all reverse tests $(\Phi,\{p,q\})$ of $\{\rho,\sigma\}$.

Theorem 2.5 (Hiai-Petz [8]) For any states $\rho$ and $\sigma$, and any constant $c > 0$, we can find a projective measurement
$M^n := \{P_n, 1-P_n\}$ such that:

\[ \lim_{n\to\infty}\mathrm{tr}\,P_n\rho^{\otimes n} = 1, \]
\[ \lim_{n\to\infty} -\frac1n\ln\mathrm{tr}\,P_n\sigma^{\otimes n} \ge D(\rho\|\sigma) - c, \]
\[ \lim_{n\to\infty}\frac1n D(M^n(\rho^{\otimes n})\|M^n(\sigma^{\otimes n})) \ge D(\rho\|\sigma) - c. \]

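Proposition 2.6 $D(\rho\|\sigma)$ satisfies the condition (C).

Proof. By Fannes's inequality, when $n$ is very large,

\[ \frac1n\left|D(\rho^{\otimes n}\|\sigma^{\otimes n}) - D(\rho^n\|\sigma^{\otimes n})\right| \le \|\rho^{\otimes n}-\rho^n\|_1\ln d + \frac1n\,\eta\!\left(\|\rho^{\otimes n}-\rho^n\|_1\right) + \|\rho^{\otimes n}-\rho^n\|_1\times\{\text{the first eigenvalue of }\ln\sigma\} \to 0, \]

where $\eta(x) := -x\ln x$.

■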

Theorem 2.7 If $D(\rho_0\|\sigma_0) > D(\rho\|\sigma)$, there is a sequence $\{\Psi^n\}$ of TPCP maps with

\[ \lim_{n\to\infty}\|\Psi^n(\rho_0^{\otimes n}) - \rho^{\otimes n}\|_1 = 0, \quad \Psi^n(\sigma_0^{\otimes n}) = \sigma^{\otimes n}. \qquad (2) \]

Conversely, if such $\{\Psi^n\}$ with (2) exists, $D(\rho_0\|\sigma_0) \ge D(\rho\|\sigma)$.

Proofs of Theorem 2.4 and Theorem 2.7 will be given later in Subsection 4.2 and Subsection 5.3, respectively.
Here, we use these to prove the main theorems.

Proof of Theorem 2.1. That $D^R$ satisfies (M) is known [5], but here we give another proof. By
Theorem 2.4,

\[ \begin{aligned}
D^R(\Lambda(\rho)\|\Lambda(\sigma)) &= \min\{D(p\|q);\ \Phi(p) = \Lambda(\rho),\ \Phi(q) = \Lambda(\sigma)\} \\
&\le \min\{D(p\|q);\ \Phi = \Lambda\circ\Psi,\ \text{s.t. } \Psi(p) = \rho,\ \Psi(q) = \sigma\} \\
&= \min\{D(p\|q);\ \Psi(p) = \rho,\ \Psi(q) = \sigma\} \\
&= D^R(\rho\|\sigma).
\end{aligned} \]

So $D^R$ satisfies (M). (N) is obviously satisfied. Also, that $\max_M D(M(\rho)\|M(\sigma))$ satisfies (M) and (N) is
trivial.

Letting $(\Phi,\{p,q\})$ be an optimal reverse test, due to (N) and (M),

\[ D^R(\rho\|\sigma) = D(p\|q) \overset{(N)}{=} D^Q(p\|q) \overset{(M)}{\ge} D^Q(\Phi(p)\|\Phi(q)) = D^Q(\rho\|\sigma). \]

Also,

\[ D(M(\rho)\|M(\sigma)) \overset{(N)}{=} D^Q(M(\rho)\|M(\sigma)) \overset{(M)}{\le} D^Q(\rho\|\sigma). \]

■

Proof of Theorem 2.2. Since $D^R(\rho\|\sigma)$ is weakly additive, we have the upper bound. The lower bound
is known [5], but can also be easily obtained from Theorem 2.1 and Theorem 2.5. ■

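Proof of Theorem 2.3. Without loss of generality, we can suppose

\[ D^Q(\rho_0\|\sigma_0) = D(\rho_0\|\sigma_0) \]

for some $\rho_0$, $\sigma_0$. For given $\rho$ and $\sigma$, let $l$, $l'$, $m$, $m'$ be integers with

\[ \frac{l'}{m'}D(\rho_0\|\sigma_0) < D(\rho\|\sigma) < \frac{l}{m}D(\rho_0\|\sigma_0). \]

By Theorem 2.7 there is $\{\Psi^n\}$ with

\[ \lim_{n\to\infty}\|\Psi^n(\rho_0^{\otimes ln}) - \rho^{\otimes mn}\|_1 = 0, \quad \Psi^n(\sigma_0^{\otimes ln}) = \sigma^{\otimes mn}. \]

Since $D^Q$ satisfies (A), (C), and (M), we have

\[ \begin{aligned}
m\,D^Q(\rho\|\sigma) &\overset{(A)}{=} \lim_{n\to\infty}\frac1n D^Q(\rho^{\otimes mn}\|\sigma^{\otimes mn}) \\
&\overset{(C)}{\le} \lim_{n\to\infty}\frac1n D^Q(\Psi^n(\rho_0^{\otimes ln})\|\Psi^n(\sigma_0^{\otimes ln})) \\
&\overset{(M)}{\le} \lim_{n\to\infty}\frac1n D^Q(\rho_0^{\otimes ln}\|\sigma_0^{\otimes ln}) \overset{(A)}{=} l\,D^Q(\rho_0\|\sigma_0),
\end{aligned} \]

or

\[ D^Q(\rho\|\sigma) \le \frac{l}{m}D^Q(\rho_0\|\sigma_0) = \frac{l}{m}D(\rho_0\|\sigma_0). \]

Exchanging $\{\rho_0,\sigma_0\}$ and $\{\rho,\sigma\}$ in the above argument, we obtain

\[ D^Q(\rho\|\sigma) \ge \frac{l'}{m'}D^Q(\rho_0\|\sigma_0) = \frac{l'}{m'}D(\rho_0\|\sigma_0). \]

Taking $l/m$ and $l'/m'$ arbitrarily close to $D(\rho\|\sigma)/D(\rho_0\|\sigma_0)$, we have

\[ D^Q(\rho\|\sigma) = D(\rho\|\sigma). \]

■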



3 Monotone metric 

3.1 Classical Fisher Information as a monotone metric 



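Let us consider a family of probability distributions $\{p_\theta;\ \theta\in\Theta\subset\mathbb{R}^m\}$ over the finite set $\mathcal{X}$, $|\mathcal{X}| < \infty$. A
logarithmic derivative is defined by $l_{\theta,i} := \partial_i\ln p_\theta$, where $\partial_i := \partial/\partial\theta^i$. The Fisher information $J_\theta$ (sometimes denoted
as $J_{p_\theta}$) is defined by

\[ J_{\theta,i,j} := \sum_x p_\theta(x)\,l_{\theta,i}(x)\,l_{\theta,j}(x) = \sum_x l_{\theta,i}(x)\,\partial_j p_\theta(x). \]

It is known that, under some regularity conditions, the optimal asymptotic mean square error of an estimate of
$\theta$ equals $J_\theta^{-1}$.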
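The two expressions for the classical Fisher information matrix can be compared numerically. The sketch below uses a small exponential family on four points with two parameters; the statistics matrix `T`, the parameter value and the finite-difference helper `grad` are hypothetical choices made only for illustration.

```python
import numpy as np

# Hypothetical sufficient statistics t_i(x) on a 4-point set (rows: i, columns: x).
T = np.array([[0.0, 1.0, 2.0, 3.0],
              [1.0, 0.0, 1.0, 0.0]])

def p(theta):
    """Exponential family p_theta(x) proportional to exp(sum_i theta_i t_i(x))."""
    w = np.exp(theta @ T)
    return w / w.sum()

def grad(f, theta, h=1e-6):
    """Central finite differences: row i holds d f / d theta_i."""
    rows = []
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = h
        rows.append((f(theta + e) - f(theta - e)) / (2 * h))
    return np.array(rows)

theta = np.array([0.3, -0.2])
dp = grad(p, theta)                          # d_i p_theta(x)
l = grad(lambda t: np.log(p(t)), theta)      # logarithmic derivatives l_{theta,i}(x)

J_first = (p(theta) * l) @ l.T               # sum_x p(x) l_i(x) l_j(x)
J_second = l @ dp.T                          # sum_x l_i(x) d_j p(x)
```

Both matrices agree up to finite-difference error because $\partial_j p_\theta = p_\theta\, l_{\theta,j}$.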

Being positive definite and covariant under coordinate changes of the parameter space, $J_\theta$ induces a Riemannian
metric, or an inner product in the tangent space $T_\theta$, by

\[ J_\theta(\partial_i p_\theta, \partial_j p_\theta) := J_{\theta,i,j}, \]

where the representation of $T_\theta$ is chosen as $\mathrm{span}\{\partial_i p_\theta;\ i = 1,\dots,m\}$. This metric brings about the following
intuitive picture: the precision of the estimate is proportional to the distance between $p_\theta$ and $p_{\theta+d\theta}$.

Hereafter, the differential map of an affine map $\Lambda$ is also denoted by $\Lambda$, by abuse of notation. Cencov [5] had
proven:

Theorem 3.1 [2] Suppose a Riemannian metric $g_{p_\theta}$ is monotone decreasing under the application of Markov maps,

\[ g_{p_\theta}(X,X) \ge g_{\Lambda(p_\theta)}(\Lambda(X),\Lambda(X)). \]

Then, $g_{p_\theta}$ is the one induced by the Fisher information, up to a constant multiple.

In the proof, it is essential that the metric is Riemannian, i.e., the norm in the tangent space is defined via
an inner product. This assumption can be replaced by weak additivity and asymptotic lower continuity [12].

3.2 SLD and RLD Fisher information

We consider a family $\{\rho_\theta;\ \theta\in\Theta\subset\mathbb{R}^m\}$ of density operators, and suppose that the map $\theta\to\rho_\theta$ is smooth enough
and that $\Theta$ is open. Define a symmetric logarithmic derivative (SLD) $L^S_{\theta,i}$ and a right logarithmic derivative (RLD)
$L^R_{\theta,i}$ as solutions to the matrix equations

\[ \partial_i\rho_\theta = \frac12\left(L^S_{\theta,i}\rho_\theta + \rho_\theta L^S_{\theta,i}\right) = L^R_{\theta,i}\rho_\theta. \]

If $\rho_\theta$ is strictly positive, $L^S_{\theta,i}$ and $L^R_{\theta,i}$ are uniquely defined in this way. If $\rho_\theta$ has zero eigenvalues, $L^S_{\theta,i}$ can still be
defined, but not uniquely. $L^R_{\theta,i}$ exists (and if it exists, it is unique) if and only if $\partial_i\rho_\theta$ has non-zero eigenvalues only in
the support of $\rho_\theta$. Observe that they are quantum equivalents of the classical logarithmic derivative $l_{\theta,i} = \partial_i\ln p_\theta(x)$.
The SLD Fisher information matrix $J^S_\theta$ and the RLD Fisher information matrix $J^R_\theta$ are defined as

\[ J^S_{\theta,i,j} = \Re\,\mathrm{Tr}\,\rho_\theta L^S_{\theta,i}L^S_{\theta,j}, \quad J^R_{\theta,i,j} = \mathrm{Tr}\,\rho_\theta L^{R\dagger}_{\theta,j}L^R_{\theta,i}, \]

respectively [5]. They are quantum analogues of the classical Fisher information $J_\theta$, and, being positive definite,
each of them induces an inner product on the tangent space $T_\theta$,

\[ J^S_\theta(\partial_i\rho_\theta,\partial_j\rho_\theta) := J^S_{\theta,i,j}, \quad J^R_\theta(\partial_i\rho_\theta,\partial_j\rho_\theta) := J^R_{\theta,i,j}, \]

where we represent $T_\theta$ by $\mathrm{span}\{\partial_i\rho_\theta;\ i = 1,\dots,m\}$. We sometimes use notations such as $J^S_{\rho_\theta}$ and $J^R_{\rho_\theta}$ to indicate
that the underlying family of states is $\{\rho_\theta\}$. Even if $\rho_\theta$ is not full-rank, $J^S_\theta$ is uniquely defined, regardless of the
non-uniqueness of the SLD.

An operational meaning of the SLD Fisher metric is given through estimation of $\theta$ in an asymptotic setting, just
like its classical counterpart. For details, see, for example, [5]. Here, we point out the relation of the SLD Fisher
information to the classical Fisher information of the family $\{M(\rho_\theta)\}$.

Theorem 3.2

\[ J_{M(\rho_\theta)} \le J^S_{\rho_\theta}. \qquad (3) \]

Also, for any $X\in T_\theta$, there is a measurement $M$ with

\[ J_{M(\rho_\theta)}(M(X),M(X)) = J^S_{\rho_\theta}(X,X). \]
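For a one-parameter family, the content of Theorem 3.2 can be illustrated numerically: the SLD is obtained by solving $\partial\rho = (L\rho + \rho L)/2$ in the eigenbasis of $\rho$, and the projective measurement in the eigenbasis of $L$ is one measurement attaining $J^S$. The sketch below is a minimal check under stated assumptions: $\rho$ is full rank, the tangent vector `drho` is an arbitrary traceless Hermitian matrix, and all function names are hypothetical.

```python
import numpy as np

def sld(rho, drho):
    """Solve drho = (L rho + rho L)/2 for Hermitian L, assuming rho is full rank."""
    w, v = np.linalg.eigh(rho)
    d = v.conj().T @ drho @ v                     # drho in the eigenbasis of rho
    L = 2 * d / (w[:, None] + w[None, :])
    return v @ L @ v.conj().T

def fisher_after_projective(rho, drho, basis):
    """Classical Fisher information of x -> <x|rho|x> for orthonormal columns of `basis`."""
    p = np.einsum('ix,ij,jx->x', basis.conj(), rho, basis).real
    dp = np.einsum('ix,ij,jx->x', basis.conj(), drho, basis).real
    return np.sum(dp ** 2 / p)

rng = np.random.default_rng(1)
g = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = g @ g.conj().T + 0.1 * np.eye(3)
rho = rho / np.trace(rho).real
h = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
h = h + h.conj().T
drho = h - np.trace(h).real * np.eye(3) / 3       # traceless Hermitian tangent vector

L = sld(rho, drho)
J_S = np.trace(rho @ L @ L).real                  # SLD Fisher information
_, eig_basis = np.linalg.eigh((L + L.conj().T) / 2)   # measurement in the eigenbasis of L
_, other_basis = np.linalg.eigh(h)                # some other projective measurement
J_opt = fisher_after_projective(rho, drho, eig_basis)
J_other = fisher_after_projective(rho, drho, other_basis)
```

In this sketch `J_opt` reproduces `J_S`, while `J_other` stays below it, consistent with inequality (3).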

3.3 RLD and Reverse SLD 

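Denote by $\mathcal{W}$ the totality of matrices $W$ with $\mathrm{tr}\,WW^\dagger = 1$. The totality of $d\times d'$ elements of $\mathcal{W}$ is denoted by
$\mathcal{W}_{d'}$, where $d' \ge r = \mathrm{rank}\,\rho$. Consider the map from $\mathcal{W}$ to $\mathcal{S}(\mathcal{H})$ such that

\[ W \mapsto WW^\dagger. \]

A meaning of this map is as follows. Let

\[ W = [\sqrt{p_1}|\phi_1\rangle,\dots,\sqrt{p_{d'}}|\phi_{d'}\rangle], \]

then,

\[ WW^\dagger = \sum_{i=1}^{d'} p_i|\phi_i\rangle\langle\phi_i|. \]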

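Proposition 3.3 There is a Hermitian matrix $L$ with $A = BL$ if and only if $AB^\dagger = BA^\dagger$ and $\mathrm{Im}\,A\subset\mathrm{Im}\,B$.

Proof. Since 'only if' is trivial, we show 'if'. Consider the singular value decomposition of $B$:

\[ B = UXV, \]

where $U$ and $V$ are $d\times r$ and $r\times d'$ matrices, respectively, with $U^\dagger U = VV^\dagger = 1$. Let

\[ L := V^\dagger X^{-1}U^\dagger AB^\dagger UX^{-1}V + V'^\dagger V'C^\dagger V^\dagger V + V^\dagger VCV'^\dagger V'. \]

Here $V'$ is a $(d'-r)\times d'$ matrix with $V'V'^\dagger = 1$ and $V'V^\dagger = 0$, and $C$ is a matrix with $A = BC$ (the existence of such
$C$ is due to $\mathrm{Im}\,A\subset\mathrm{Im}\,B$). Since $AB^\dagger$ is Hermitian, $L$ is Hermitian. Also,

\[ \begin{aligned}
BL &= UXV\left\{V^\dagger X^{-1}U^\dagger A(UXV)^\dagger UX^{-1}V + V'^\dagger V'C^\dagger V^\dagger V + V^\dagger VCV'^\dagger V'\right\} \\
&= AV^\dagger V + UXVV'^\dagger V'C^\dagger V^\dagger V + UXVCV'^\dagger V' \\
&= AV^\dagger V + BCV'^\dagger V' \\
&= A\left(V^\dagger V + V'^\dagger V'\right) = A.
\end{aligned} \]

Hence, $L$ satisfies the required condition, and the assertion is proved. ■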
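The construction in the proof of Proposition 3.3 can be replayed numerically. The sketch below builds a rank-deficient $B$, an $A$ that satisfies the two hypotheses by construction, and then assembles $L$ from the singular value decomposition as in the proof. The matrix sizes, the random inputs and the use of the pseudo-inverse to pick one solution $C$ of $A = BC$ are illustrative assumptions.

```python
import numpy as np

def dag(m):
    """Conjugate transpose."""
    return m.conj().T

rng = np.random.default_rng(2)
d, dp, r = 4, 5, 3                        # B is d x d' with rank r (illustrative sizes)

left = rng.normal(size=(d, r)) + 1j * rng.normal(size=(d, r))
right = rng.normal(size=(r, dp)) + 1j * rng.normal(size=(r, dp))
B = left @ right
H = rng.normal(size=(dp, dp)) + 1j * rng.normal(size=(dp, dp))
A = B @ (H + dag(H))                      # ensures A B^dagger Hermitian and Im A in Im B

# Singular value decomposition B = U X V with U^dagger U = V V^dagger = 1.
U_full, s, Vh_full = np.linalg.svd(B)
U, X, V = U_full[:, :r], np.diag(s[:r]), Vh_full[:r, :]
Vp = Vh_full[r:, :]                       # V': orthonormal rows, orthogonal to the rows of V
C = np.linalg.pinv(B) @ A                 # one solution of A = B C
Xi = np.linalg.inv(X)

L = (dag(V) @ Xi @ dag(U) @ A @ dag(B) @ U @ Xi @ V
     + dag(Vp) @ Vp @ dag(C) @ dag(V) @ V
     + dag(V) @ V @ C @ dag(Vp) @ Vp)
```

The resulting `L` is Hermitian and satisfies `B @ L == A` up to rounding; it need not coincide with the Hermitian matrix used to generate `A`, since the solution is not unique when $B$ has a non-trivial kernel.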
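Since

\[ (L^R_{\theta,i}W)W^\dagger = L^R_{\theta,i}\rho_\theta = \rho_\theta L^{R\dagger}_{\theta,i} = W(L^R_{\theta,i}W)^\dagger, \]

due to Proposition 3.3,

\[ L^R_{\theta,i}W = W\Lambda^R_{\theta,i}, \quad \exists\,\Lambda^R_{\theta,i} = (\Lambda^R_{\theta,i})^\dagger. \]

$\Lambda^R_{\theta,i}$ is called the reverse SLD at $W$.

On the other hand, let $\Lambda$ be an arbitrary $d'\times d'$ Hermitian matrix. Observe that the image of $W\Lambda W^\dagger$ is a
subspace of the image of $WW^\dagger$. Therefore, for an arbitrary reverse SLD, there is an RLD, i.e.,

\[ \forall\Lambda = \Lambda^\dagger,\ \exists L^R:\quad L^RWW^\dagger = W\Lambda W^\dagger, \]

and, letting $Q$ be the projection onto $(\ker W)^\perp = \mathrm{Im}\,W^\dagger$,

\[ L^RW = W\Lambda Q. \]

(Especially, if $d' = r$, $L^RW = W\Lambda$.) Therefore, we have

\[ \mathrm{Tr}\,\rho L^{R\dagger}L^R = \mathrm{Tr}\,L^RWW^\dagger L^{R\dagger} = \mathrm{Tr}\,W\Lambda Q\Lambda W^\dagger \le \mathrm{Tr}\,W\Lambda^2W^\dagger. \qquad (4) \]

Especially, if $d' = r$, the equality holds.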

3.4 Reverse estimation of quantum state family and RLD 

The heart of quantum statistics is the optimization of a measurement, i.e., the choice of a measurement which converts
a family of quantum states to the most informative classical probability distribution family. In estimation of
the parameter $\theta$ in the asymptotic situation, we maximize the output Fisher information $J_{M(\rho_\theta)}$ by modifying $M$.

Now, we consider the reverse of the above, i.e., generation of the quantum state family $\{\rho_\theta\}$: a pair $(\Phi,\{p_\theta\})$ is
said to be a reverse estimation of $\{\rho_\theta\}$ if

\[ \Phi(p_\theta) = \rho_\theta, \quad \forall\theta\in\Theta. \]

The classical version of this is nothing but the randomization criterion of deficiency, a concept which plays a key role in
statistical decision theory [18]. Let us introduce a 'local' version of this condition. We say $(\Phi,\{p_\theta,\partial_ip_\theta;\ i = 1,\dots,m\})$
is a tangent reverse estimation of $\{\rho_\theta,\partial_i\rho_\theta;\ i = 1,\dots,m\}$ if

\[ \Phi(p_\theta) = \rho_\theta, \quad \Phi(\partial_ip_\theta) = \partial_i\rho_\theta \qquad (5) \]

hold at $\theta$. (In statistical decision theory, when this relation holds, we say $\{\rho_\theta,\partial_i\rho_\theta;\ i = 1,\dots,m\}$ is locally
deficient relative to $\{p_\theta,\partial_ip_\theta;\ i = 1,\dots,m\}$ at $\theta$ [18].)

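Now let us consider the $m = 1$ case, and optimize $(\Phi,\{p_\theta,dp_\theta/d\theta\})$ to minimize the Fisher information $J_{p_\theta}$.
Let us denote by $\delta_x$ the delta distribution at $x$. Suppose $\Phi(\delta_x)$ is pure (this can be supposed without loss of
generality) and let

\[ |\phi_x\rangle\langle\phi_x| := \Phi(\delta_x). \]

Then (5) is rewritten as

\[ \rho_\theta = \sum_{x=1}^{d'}p_\theta(x)|\phi_x\rangle\langle\phi_x|, \quad \frac{d\rho_\theta}{d\theta} = \sum_{x=1}^{d'}\frac{dp_\theta(x)}{d\theta}|\phi_x\rangle\langle\phi_x|. \]

If $p_\theta(x) = 0$ and $dp_\theta(x)/d\theta \ne 0$, the input Fisher information is infinite. So let us suppose this is not the case, and
let us define

\[ W = [\sqrt{p_\theta(1)}|\phi_1\rangle,\dots,\sqrt{p_\theta(d')}|\phi_{d'}\rangle], \quad
\Lambda = \mathrm{diag}\left(\frac{1}{p_\theta(1)}\frac{dp_\theta(1)}{d\theta},\dots,\frac{1}{p_\theta(d')}\frac{dp_\theta(d')}{d\theta}\right). \]

Then, the input Fisher information is

\[ J_{p_\theta} = \sum_{x=1}^{d'}\frac{1}{p_\theta(x)}\left(\frac{dp_\theta(x)}{d\theta}\right)^2 = \mathrm{Tr}\,W\Lambda^2W^\dagger \ge J^R_{\rho_\theta}, \]

where the inequality is due to (4).

On the other hand, let us suppose $\mathrm{rank}\,W = r$ and $W$ satisfies $WW^\dagger = \rho_\theta$, and let $\Lambda$ be the reverse SLD at
$W$, $W\Lambda W^\dagger = d\rho_\theta/d\theta$. Then this $\Lambda$ achieves the equality in (4). Since $(WU)(WU)^\dagger = \rho_\theta$ for any unitary matrix
$U$, we can suppose that $\Lambda$ is diagonal, by choosing $W$ properly. Therefore, one can define $\Phi$ by tracking the above
process in the inverse way, which achieves the equality. Therefore, we have:

Theorem 3.4 Suppose $\dim\Theta = 1$. Then

\[ J_{p_\theta} \ge J^R_{\rho_\theta} \qquad (8) \]

holds for all the reverse estimations of $\{\rho_\theta\}$, and there is a tangent reverse estimation $(\Phi,\{p_\theta,dp_\theta/d\theta\})$ with

\[ J_{p_\theta} = J^R_{\rho_\theta}. \]

The $m > 1$ case is briefly discussed in Section 7.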



3.5 Monotone metric 

In this subsection, to avoid notational complexity, we let $m = 1$, and abbreviate $J^S_{\rho_\theta}(d\rho_\theta/d\theta, d\rho_\theta/d\theta)$ as $J^S_{\rho_\theta}$,
and so on. The corresponding statements for the $m > 1$ case are easily obtained by considering an appropriate
one-dimensional subfamily.

It is known that the SLD Fisher metric and the RLD Fisher metric are monotone decreasing under the application of CPTP
maps,

\[ J^R_{\Lambda(\rho_\theta)} \le J^R_{\rho_\theta}, \quad J^S_{\Lambda(\rho_\theta)} \le J^S_{\rho_\theta}, \]

and that any monotone Riemannian metric $g$, if a constant factor is properly chosen, takes values between the SLD and
RLD Fisher metrics,

\[ J^S \le g \le J^R \]

[17]. In this section, we give an operational proof of a slightly stronger version of these facts.

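First, monotonicity of the SLD Fisher metric is trivial because the optimization of a measurement applied to the family $\{\Lambda(\rho_\theta)\}$
is equivalent to the optimization of a measurement applied to $\{\rho_\theta\}$ over the restricted class of measurements of
the form $M\circ\Lambda$:

\[ J^S_{\Lambda(\rho_\theta)} = \max_M J_{M\circ\Lambda(\rho_\theta)} \le \max_M J_{M(\rho_\theta)} = J^S_{\rho_\theta}. \]

The monotonicity of the RLD Fisher metric is proven in a similar manner. Given a tangent reverse estimation
$(\Phi,\{p_\theta,dp_\theta/d\theta\})$ of $\{\rho_\theta,d\rho_\theta/d\theta\}$, $(\Lambda\circ\Phi,\{p_\theta\})$ is a tangent reverse estimation of $\{\Lambda(\rho_\theta),\Lambda(d\rho_\theta/d\theta)\}$. Since
$\{\Lambda(\rho_\theta),\Lambda(d\rho_\theta/d\theta)\}$ may have a better tangent reverse estimation, we have

\[ \begin{aligned}
J^R_{\Lambda(\rho_\theta)} &= \min\{J_{p_\theta};\ \Phi(p_\theta) = \Lambda(\rho_\theta),\ \Phi(dp_\theta/d\theta) = \Lambda(d\rho_\theta/d\theta)\} \\
&\le \min\{J_{p_\theta};\ \Phi = \Lambda\circ\Psi,\ \text{s.t. } \Psi(p_\theta) = \rho_\theta,\ \Psi(dp_\theta/d\theta) = d\rho_\theta/d\theta\} \\
&= J^R_{\rho_\theta}.
\end{aligned} \]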

Assume that a metric $g$ is not increasing under a quantum-classical (QC) channel, and coincides with the classical
Fisher information when restricted to classical probability distributions. Then, this metric should be no smaller than
the SLD Fisher metric: if one applies the optimal measurement $M$,

\[ J^S_{\rho_\theta} = J_{M(\rho_\theta)} = g_{M(\rho_\theta)}, \]

where the second identity is the assumption of normalization; therefore, due to the monotonicity under the
measurement $M$,

\[ g_{\rho_\theta} \ge g_{M(\rho_\theta)} = J^S_{\rho_\theta}. \]

Similarly, assume that a metric is not increasing under a classical-quantum (CQ) channel and coincides with the
classical Fisher information for probability distributions. Then, the metric should be no larger than the RLD Fisher
metric: an optimal tangent reverse estimation $(\Phi,\{p_\theta,dp_\theta/d\theta\})$ of $\{\rho_\theta,d\rho_\theta/d\theta\}$ satisfies

\[ J^R_{\rho_\theta} = J_{p_\theta} = g_{p_\theta}, \]

where the second identity is due to the assumption of normalization; therefore, due to monotonicity under the CQ
map $\Phi$,

\[ g_{\rho_\theta} \le g_{p_\theta} = J^R_{\rho_\theta}. \]

Here, we have not assumed that the metric is Riemannian, or induced from an inner product in the tangent
space, in contrast to the argument in [17]. Also, we have only assumed monotonicity under QC and CQ maps:

Theorem 3.5 Assume that a (not necessarily Riemannian) metric $g$ coincides with the classical Fisher information
on the space of classical probability distributions. Then, if $g$ is monotone decreasing under QC maps, $g$ is no
smaller than the SLD Fisher metric. If $g$ is monotone decreasing under CQ maps, $g$ is no larger than the RLD Fisher
metric.

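Example 3.6 (Petz metrics) In [17], Petz had shown that any monotone Riemannian metric can be written as

\[ g^f_{\rho_\theta}(\partial_i\rho_\theta,\partial_j\rho_\theta) := \mathrm{tr}\,\partial_i\rho_\theta\left\{R_{\rho_\theta}f\!\left(L_{\rho_\theta}R_{\rho_\theta}^{-1}\right)\right\}^{-1}\partial_j\rho_\theta, \]

where $L_{\rho_\theta}$ and $R_{\rho_\theta}$ are maps from $\mathcal{B}(\mathcal{H})$ to $\mathcal{B}(\mathcal{H})$ with

\[ L_{\rho_\theta}(A) = \rho_\theta A, \quad R_{\rho_\theta}(A) = A\rho_\theta, \]

and $f$ is an operator monotone function with

\[ f(x) = xf(x^{-1}), \quad f(1) = 1. \]

For the RLD and SLD metrics, $f(x) = \frac{2x}{1+x}$ and $f(x) = \frac{1+x}{2}$, respectively. If $f(x) = \frac{x-1}{\ln x}$,

\[ g^f_{\rho_\theta}(\partial_i\rho_\theta,\partial_j\rho_\theta) = \mathrm{tr}\,\partial_i\rho_\theta\,\partial_j\ln\rho_\theta =: J^B_{\theta,i,j}, \]

which is called the Bogoljubov-Kubo-Mori (BKM) metric. It had been known that

\[ J^S_\theta \le J^B_\theta \le J^R_\theta. \]

Also,

\[ f^{(\alpha)}(x) := \frac{1-\alpha^2}{4}\,\frac{(x-1)^2}{\left(x^{(1-\alpha)/2}-1\right)\left(x^{(1+\alpha)/2}-1\right)} \]

is operator monotone for $|\alpha| \le 3$, and the corresponding metric will be denoted by $J^{(\alpha)}_\theta$ hereafter. It holds that

\[ J^{(3)}_\theta = J^{(-3)}_\theta = J^R_\theta, \quad J^{(1)}_\theta = J^{(-1)}_\theta = J^B_\theta. \]

Hence, $J^{(\alpha)}_\theta$ ($1 \le \alpha \le 3$) 'interpolates' between $J^B_\theta$ and $J^R_\theta$. In addition, it is shown that

\[ J^S_\theta \le J^{(\alpha)}_\theta \le J^B_\theta. \qquad (9) \]

4 Non-asymptotic scenario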

4.1 Parallel family of states 

A family $\{\rho_\theta\}$ is said to be RLD-parallel if and only if

\[ \rho_\theta = NM_\theta N^\dagger, \]

and

\[ M_\theta = \mathrm{diag}(p_\theta(1),p_\theta(2),\dots,p_\theta(r)), \quad N = [|\phi_1\rangle,|\phi_2\rangle,\dots,|\phi_r\rangle], \]

where $\{|\phi_1\rangle,\dots,|\phi_r\rangle\}$ is a linearly independent, normalized, but not necessarily orthogonal system of state
vectors. This condition is equivalent to

\[ \rho_\theta = \sum_{x=1}^{r}p_\theta(x)|\phi_x\rangle\langle\phi_x|. \]

Its operational meaning is as follows. Observe that $(\Phi,\{p_\theta\})$ is a reverse estimation of $\{\rho_\theta\}$, with $\Phi(\delta_x) =
|\phi_x\rangle\langle\phi_x|$. The Fisher information $J_\theta$ of $\{p_\theta\}$ is easily computed by observing

\[ L^R_{\theta,i} = N\,\mathrm{diag}(\partial_i\ln p_\theta(1),\dots,\partial_i\ln p_\theta(r))\,N^{-1} \]

(where $N^{-1}$ is the Moore-Penrose generalized inverse), and we obtain

\[ J_\theta = J^R_\theta. \qquad (10) \]

Hence this reverse estimation achieves the lower bound suggested by Theorem 3.4 at any $\theta$.
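Identity (10) can be illustrated for a one-parameter RLD-parallel family. In the sketch below, `N` has normalized but non-orthogonal random columns, and `p(theta)` is a hypothetical exponential family on $r = 3$ points whose derivative is known in closed form; the RLD is computed both from its defining equation $d\rho_\theta = L^R\rho_\theta$ and from the diagonal formula above. Since `N` is square and invertible here, its generalized inverse is the ordinary inverse.

```python
import numpy as np

rng = np.random.default_rng(3)
r = 3
N = rng.normal(size=(r, r)) + 1j * rng.normal(size=(r, r))
N = N / np.linalg.norm(N, axis=0)         # normalized, generally non-orthogonal columns

x = np.arange(r, dtype=float)

def p(theta):
    """Hypothetical one-parameter family p_theta(x) proportional to exp(theta * x)."""
    w = np.exp(theta * x)
    return w / w.sum()

theta = 0.4
p_t = p(theta)
dp = p_t * (x - p_t @ x)                  # exact derivative d p_theta / d theta

rho = N @ np.diag(p_t) @ N.conj().T
drho = N @ np.diag(dp) @ N.conj().T

L_R = drho @ np.linalg.inv(rho)           # RLD from d rho = L^R rho
L_R_formula = N @ np.diag(dp / p_t) @ np.linalg.inv(N)

J_R = np.trace(rho @ L_R.conj().T @ L_R).real   # RLD Fisher information
J_classical = np.sum(dp ** 2 / p_t)             # Fisher information of {p_theta}
```

The two values `J_R` and `J_classical` agree because the columns of `N` are normalized, which is the mechanism behind (10).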
Hereafter, let

\[ p_t^{(m)} := tp + (1-t)q, \quad \rho_t^{(m)} := t\rho + (1-t)\sigma. \]

Proposition 4.1 For any $\rho$ and $\sigma$ with $\mathrm{supp}\,\rho = \mathrm{supp}\,\sigma$, there is an RLD-parallel manifold containing $\rho$, $\sigma$,
and $\rho_t^{(m)}$ for any $0 \le t \le 1$.

Proof. Let $P$ be the projection onto $\mathrm{supp}\,\rho = \mathrm{supp}\,\sigma$. Let $U$ be a unitary matrix such that $PU(1-P) = 0$ and

\[ \sqrt{\rho}\,U\sqrt{\sigma} = \sqrt{\sigma}\,U^\dagger\sqrt{\rho} \]

(this condition can equivalently be written in terms of the generalized inverses $\rho^{-1}$ and $\sigma^{-1}$). Such $U$ is found
using the polar decomposition of $\sqrt{\sigma^{-1}}\sqrt{\rho}$. By Proposition 3.3, there is $X$ with

\[ \sqrt{\rho}\,U = \sqrt{\sigma}\,X, \quad X = X^\dagger. \]

Let $VDV^\dagger$ be the diagonalization of $X$, and we obtain

\[ \sqrt{\rho}\,UV = \sqrt{\sigma}\,VD. \]

Divide the $x$th column vector of $\sqrt{\sigma}\,V$ by its magnitude and denote the result by $|\phi_x\rangle$. Then letting $N :=
[|\phi_1\rangle,|\phi_2\rangle,\dots,|\phi_r\rangle]$, we have

\[ \rho = N\,\mathrm{diag}(p(1),\dots,p(r))\,N^\dagger, \quad \sigma = N\,\mathrm{diag}(q(1),\dots,q(r))\,N^\dagger, \]
\[ \rho_t^{(m)} = N\,\mathrm{diag}(p_t^{(m)}(1),\dots,p_t^{(m)}(r))\,N^\dagger, \]

for some $p(1),\dots,p(r)$ and $q(1),\dots,q(r)$, and the assertion is proved. ■

4.2 Reverse test

Consider a test of the hypothesis 'the given state is $\rho$' against the alternative hypothesis 'the given state is $\sigma$'.
(Hereafter, such a test is referred to as "test '$\rho$ vs. $\sigma$'".) Suppose we are given many copies of the unknown
states, and the error $\alpha_n$ of the first kind, or the probability of rejecting $\rho$ while $\rho$ is the true state, vanishes as
$n\to\infty$. Then in maximizing the exponent of the error $\beta_n$ of the second kind, or the probability of rejecting $\sigma$
while $\sigma$ is the true state, the key step is the optimization of a QC map (measurement) $M$ to maximize the relative
entropy $D(M(\rho^{\otimes n})\|M(\sigma^{\otimes n}))$.

We consider the reverse test, or the inverse process of (the single copy version of) the above. Given a pair $\{\rho,\sigma\}$
of states, let $\Phi$ be a CQ map with

\[ \Phi(p) = \rho, \quad \Phi(q) = \sigma, \]

where $\{p,q\}$ is a pair of probability distributions. A pair $(\Phi,\{p,q\})$ is called a reverse test of $\{\rho,\sigma\}$. (In the
terminology of statistical decision theory, $\{\rho,\sigma\}$ is deficient relative to $\{p,q\}$.) Our task is to minimize $D(p\|q)$
over all reverse tests.

To find the optimal reverse test, the following lemma plays a key role:



Lemma 4.2 ([1], Chapter 3, Section 3.5)

\[ D(p\|q) = \int_0^1\!\!\int_0^t J_{p_s^{(m)}}\,ds\,dt. \]

Let $(\Phi,\{p,q\})$ be a reverse test of $\{\rho,\sigma\}$. Then

\[ \Phi\!\left(p_t^{(m)}\right) = \rho_t^{(m)} \]

holds, and $(\Phi,\{p_t^{(m)}\})$ is a reverse estimation of $\{\rho_t^{(m)}\}$. Therefore, by Lemma 4.2 and Theorem 3.4 we have

\[ D(p\|q) \ge \int_0^1\!\!\int_0^t J^R_{\rho_s^{(m)}}\,ds\,dt. \]
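The integral representation of Lemma 4.2 is easy to verify numerically for the mixture family $p_s^{(m)} = sp + (1-s)q$, whose Fisher information at $s$ is $\sum_x (p(x)-q(x))^2/p_s^{(m)}(x)$. The sketch below uses two hypothetical distributions on three points and Gauss-Legendre quadrature; exchanging the order of integration reduces the double integral to $\int_0^1(1-s)J_s\,ds$.

```python
import numpy as np

# Two hypothetical strictly positive distributions on a 3-point set.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.2, 0.6])

def fisher_on_mixture(s):
    """Fisher information of the family p_s = s p + (1 - s) q at parameter s."""
    p_s = s * p + (1 - s) * q
    return np.sum((p - q) ** 2 / p_s)

# int_0^1 int_0^t J_s ds dt = int_0^1 (1 - s) J_s ds.
nodes, weights = np.polynomial.legendre.leggauss(40)
s_nodes = (nodes + 1) / 2                 # map Gauss-Legendre nodes from [-1, 1] to [0, 1]
values = np.array([(1 - s) * fisher_on_mixture(s) for s in s_nodes])
integral = np.sum(weights / 2 * values)

kl = np.sum(p * np.log(p / q))            # D(p||q)
```

The agreement of `integral` and `kl` reflects that $s \mapsto D(p_s^{(m)}\|q)$ vanishes to first order at $s = 0$ and has second derivative $J_{p_s^{(m)}}$.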

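Also, by Proposition 4.1 there is a parallel family which contains $\{\rho_t^{(m)}\}$. Therefore,

\[ \rho_t^{(m)} = N\,\mathrm{diag}\!\left(p_t^{(m)}(1),\dots,p_t^{(m)}(r)\right)N^\dagger \]

holds for some $N = [|\phi_1\rangle,\dots,|\phi_r\rangle]$. Hence, by (10), the reverse estimation $(\Phi_0,\{p_t^{(m)}\})$, where $\Phi_0(\delta_x) = |\phi_x\rangle\langle\phi_x|$,
achieves

\[ J_{p_t^{(m)}} = J^R_{\rho_t^{(m)}}, \]

and $\Phi_0(p) = \rho$, $\Phi_0(q) = \sigma$. Therefore, the reverse test $(\Phi_0,\{p,q\})$ achieves

\[ \int_0^1\!\!\int_0^t J^R_{\rho_s^{(m)}}\,ds\,dt = \int_0^1\!\!\int_0^t J_{p_s^{(m)}}\,ds\,dt = \sum_{x=1}^{r}p(x)\ln\frac{p(x)}{q(x)}, \]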

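and hence is optimal. The rightmost integral is computed in [6], although the details are not described. Here
we show a way to verify that the leftmost side equals $D^R(\rho\|\sigma)$, as in [11]. Observe that there is an $r\times r$ unitary
matrix $U$ with

\[ \rho^{1/2} = UD_0N^\dagger = ND_0U^\dagger, \quad \sigma^{1/2} = UD_1N^\dagger = ND_1U^\dagger, \]
\[ D_0 := \mathrm{diag}\!\left(\sqrt{p(1)},\dots,\sqrt{p(r)}\right), \quad D_1 := \mathrm{diag}\!\left(\sqrt{q(1)},\dots,\sqrt{q(r)}\right). \]

Therefore,

\[ \begin{aligned}
\mathrm{tr}\,\rho\ln\rho^{1/2}\sigma^{-1}\rho^{1/2}
&= \mathrm{tr}\,UD_0N^\dagger ND_0U^\dagger\ln\left\{UD_0N^\dagger\left(UD_1N^\dagger\right)^{-1}\left(ND_1U^\dagger\right)^{-1}ND_0U^\dagger\right\} \\
&= \mathrm{tr}\,N^\dagger N\,\mathrm{diag}\!\left(p(1)\ln\frac{p(1)}{q(1)},\dots,p(r)\ln\frac{p(r)}{q(r)}\right) \\
&= \sum_{x=1}^{r}p(x)\ln\frac{p(x)}{q(x)}.
\end{aligned} \]

Thus we obtain:

Theorem 2.4

\[ \min D(p\|q) = D^R(\rho\|\sigma), \]

where the minimization is taken over all reverse tests $(\Phi,\{p,q\})$ of $\{\rho,\sigma\}$.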
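A reverse test attaining $D^R(\rho\|\sigma)$ can be written down explicitly for full-rank states. The sketch below uses the eigen-decomposition of $\sigma^{-1/2}\rho\,\sigma^{-1/2}$ to obtain a common (non-orthogonal) system $\{|\phi_x\rangle\}$; this is one standard route to such a decomposition and is assumed here for illustration, so it may differ in detail from the construction in Proposition 4.1. All helper names and the random test states are hypothetical. The code only checks that this particular reverse test reproduces $\rho$ and $\sigma$ and has $D(p\|q) = D^R(\rho\|\sigma)$; it does not verify minimality.

```python
import numpy as np

def herm_fun(a, f):
    """Apply a scalar function f to a Hermitian matrix via its eigen-decomposition."""
    w, v = np.linalg.eigh(a)
    return (v * f(w)) @ v.conj().T

def random_state(d, rng):
    """Random full-rank density matrix (illustrative sampling)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T + 0.1 * np.eye(d)
    return rho / np.trace(rho).real

rng = np.random.default_rng(4)
rho, sigma = random_state(3, rng), random_state(3, rng)

sig_half = herm_fun(sigma, np.sqrt)
sig_inv_half = herm_fun(sigma, lambda w: 1 / np.sqrt(w))
T = sig_inv_half @ rho @ sig_inv_half
lam, V = np.linalg.eigh((T + T.conj().T) / 2)

cols = sig_half @ V                       # unnormalized vectors sigma^{1/2} v_x
q = np.linalg.norm(cols, axis=0) ** 2     # q(x) = || sigma^{1/2} v_x ||^2
p = lam * q                               # p(x) = lambda_x q(x)
N = cols / np.sqrt(q)                     # normalized columns |phi_x>

rho_back = N @ np.diag(p) @ N.conj().T    # Phi(p), with Phi(delta_x) = |phi_x><phi_x|
sigma_back = N @ np.diag(q) @ N.conj().T  # Phi(q)
kl = np.sum(p * np.log(p / q))            # D(p||q)

s = herm_fun(rho, np.sqrt)
m = s @ np.linalg.inv(sigma) @ s
d_r = np.trace(rho @ herm_fun((m + m.conj().T) / 2, np.log)).real   # D^R(rho||sigma)
```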

4.3 Monotone relative entropy 

An example of a quantity satisfying (M) and (N) is $D(\rho\|\sigma) = \mathrm{tr}\,\rho(\ln\rho-\ln\sigma)$. By Theorem 2.1, we obtain
another proof of the inequality shown in [8],

\[ D(\rho\|\sigma) \le D^R(\rho\|\sigma). \]

Another example is

\[ D^g(\rho\|\sigma) := \int_0^1\!\!\int_0^t g_{\rho_s^{(m)}}\,ds\,dt, \]

where $g$ is any properly normalized monotone metric. Note that $D^R = D^{J^R}$. Also, it is known [14][6] that

\[ D^{J^B}(\rho\|\sigma) = D(\rho\|\sigma). \qquad (11) \]

Due to Lemma 4.2, $D^g(p\|q) = D(p\|q)$ for all probability distributions $p$, $q$. Also, since

\[ \Lambda\!\left(\rho_t^{(m)}\right) = t\Lambda(\rho) + (1-t)\Lambda(\sigma), \]

$D^g(\rho\|\sigma)$ is monotone decreasing under the application of CPTP maps:

\[ D^g(\Lambda(\rho)\|\Lambda(\sigma)) = \int_0^1\!\!\int_0^t g_{\Lambda(\rho_s^{(m)})}\,ds\,dt \le \int_0^1\!\!\int_0^t g_{\rho_s^{(m)}}\,ds\,dt = D^g(\rho\|\sigma). \]

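Corollary 4.3

\[ \lim_{n\to\infty}\frac1n D^{J^S}(\rho^{\otimes n}\|\sigma^{\otimes n}) = D(\rho\|\sigma), \quad \lim_{n\to\infty}\frac1n D^{J^{(\alpha)}}(\rho^{\otimes n}\|\sigma^{\otimes n}) = D(\rho\|\sigma). \]

Proof. Since both of

\[ \limsup_{n\to\infty}\frac1n D^{J^S}(\rho^{\otimes n}\|\sigma^{\otimes n}), \quad \liminf_{n\to\infty}\frac1n D^{J^S}(\rho^{\otimes n}\|\sigma^{\otimes n}) \]

satisfy (M), (N), and (A), Theorem 2.2 implies

\[ \limsup_{n\to\infty}\frac1n D^{J^S}(\rho^{\otimes n}\|\sigma^{\otimes n}) \ge \liminf_{n\to\infty}\frac1n D^{J^S}(\rho^{\otimes n}\|\sigma^{\otimes n}) \ge D(\rho\|\sigma). \]

On the other hand, since $D(\rho\|\sigma) = D^{J^B}(\rho\|\sigma)$ and $J^B \ge J^S$,

\[ D(\rho\|\sigma) = \frac1n D(\rho^{\otimes n}\|\sigma^{\otimes n}) \ge \frac1n D^{J^S}(\rho^{\otimes n}\|\sigma^{\otimes n}). \]

After all,

\[ D(\rho\|\sigma) \ge \limsup_{n\to\infty}\frac1n D^{J^S}(\rho^{\otimes n}\|\sigma^{\otimes n}) \ge \liminf_{n\to\infty}\frac1n D^{J^S}(\rho^{\otimes n}\|\sigma^{\otimes n}) \ge D(\rho\|\sigma). \]

Due to (9), the second identity follows from the first. ■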
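Theorem 4.4

\[ \lambda D^R(\rho_0\|\sigma_0) + (1-\lambda)D^R(\rho_1\|\sigma_1) \ge D^R(\lambda\rho_0 + (1-\lambda)\rho_1\,\|\,\lambda\sigma_0 + (1-\lambda)\sigma_1). \]

Proof. Let $(\Phi_y,\{p_y,q_y\})$ be an optimal reverse test of $\{\rho_y,\sigma_y\}$ ($y = 0,1$). Define $\tilde p_{y_0}(x,y) := p_{y_0}(x)\delta_{y_0}(y)$,
$\tilde q_{y_0}(x,y) := q_{y_0}(x)\delta_{y_0}(y)$, and $\Phi(\delta_{(x,y)}) := \Phi_y(\delta_x)$. Then we have

\[ \Phi(\tilde p_{y_0}) = \sum_{x,y}p_{y_0}(x)\delta_{y_0}(y)\Phi_y(\delta_x) = \rho_{y_0}, \quad \Phi(\tilde q_{y_0}) = \sigma_{y_0}, \]
\[ D(\tilde p_{y_0}\|\tilde q_{y_0}) = D(p_{y_0}\|q_{y_0}) = D^R(\rho_{y_0}\|\sigma_{y_0}). \]

Therefore,

\[ \begin{aligned}
\lambda D^R(\rho_0\|\sigma_0) + (1-\lambda)D^R(\rho_1\|\sigma_1) &= \lambda D(\tilde p_0\|\tilde q_0) + (1-\lambda)D(\tilde p_1\|\tilde q_1) \\
&\ge D(\lambda\tilde p_0 + (1-\lambda)\tilde p_1\,\|\,\lambda\tilde q_0 + (1-\lambda)\tilde q_1) \\
&\ge D^R(\Phi(\lambda\tilde p_0 + (1-\lambda)\tilde p_1)\,\|\,\Phi(\lambda\tilde q_0 + (1-\lambda)\tilde q_1)) \\
&= D^R(\lambda\rho_0 + (1-\lambda)\rho_1\,\|\,\lambda\sigma_0 + (1-\lambda)\sigma_1).
\end{aligned} \]

■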

5 Asymptotic scenario 
5.1 Asymptotic reverse test 

The result of the test '$\rho^n$ vs. $\sigma^n$' is binary, that is, accept $\rho^n$ or $\sigma^n$. Hence, a natural inverse problem would
be the generation of $\{\rho^n,\sigma^n\}$ from probability distributions $\{p^n,q^n\}$ over the binary set $\{0,1\}$. Let us define an
asymptotic reverse test, or a pair $(\Phi^n,\{p^n,q^n\})$ with

\[ \lim_{n\to\infty}\|\Phi^n(p^n)-\rho^n\|_1 = 0, \quad \Phi^n(q^n) = \sigma^n, \]
\[ \lim_{n\to\infty}p^n(0) = \lim_{n\to\infty}q^n(1) = 1, \qquad (12) \]

and discuss the infimum of

\[ \limsup_{n\to\infty} -\frac1n\ln q^n(0). \]

To describe the infimum, we need the following object:

\[ D_{\max}(\{\rho^n\}\|\{\sigma^n\}) := \inf\left\{a;\ \tilde\rho^n \le e^{na}\sigma^n,\ \lim_{n\to\infty}\|\tilde\rho^n-\rho^n\|_1 = 0\right\}. \]

The following proposition is trivial.

Proposition 5.1 $D_{\max}$ is monotone decreasing under the application of a CPTP map $\Lambda$,

\[ D_{\max}(\{\Lambda(\rho^n)\}\|\{\Lambda(\sigma^n)\}) \le D_{\max}(\{\rho^n\}\|\{\sigma^n\}), \qquad (13) \]

and asymptotically continuous in the first argument,

\[ \lim_{n\to\infty}\|\tilde\rho^n-\rho^n\|_1 = 0 \implies D_{\max}(\{\tilde\rho^n\}\|\{\sigma^n\}) = D_{\max}(\{\rho^n\}\|\{\sigma^n\}). \qquad (14) \]

Theorem 5.2

\[ \inf\limsup_{n\to\infty} -\frac1n\ln q^n(0) = D_{\max}(\{\rho^n\}\|\{\sigma^n\}), \]

where the infimum is taken over all asymptotic reverse tests.

Proof. First, we show '$\le$'. By the definition of $D_{\max}$, for any $c > 0$, it is possible to define $\Phi^n(\delta_0)$ so that

\[ \lim_{n\to\infty}\|\Phi^n(\delta_0)-\rho^n\|_1 = 0, \quad \Phi^n(\delta_0) \le \sigma^n\exp\{n(D_{\max}(\{\rho^n\}\|\{\sigma^n\})+c)\} \]

hold. Then, letting

\[ \lim_{n\to\infty}p^n(0) = 1, \quad q^n(0) = \exp\{-n(D_{\max}(\{\rho^n\}\|\{\sigma^n\})+c)\}, \]

we have

\[ \lim_{n\to\infty}\|\Phi^n(p^n)-\rho^n\|_1 = \lim_{n\to\infty}\|\Phi^n(\delta_0)-\rho^n\|_1 = 0, \quad \sigma^n - q^n(0)\Phi^n(\delta_0) \ge 0. \]

Therefore, it is possible to define $\Phi^n(\delta_1)$ so that

\[ \Phi^n(q^n) = q^n(0)\Phi^n(\delta_0) + q^n(1)\Phi^n(\delta_1) = \sigma^n \]

holds. To sum up, the sequence of reverse tests $(\Phi^n,\{p^n,q^n\})$ satisfies the requirement (12), and satisfies

\[ \limsup_{n\to\infty} -\frac1n\ln q^n(0) = D_{\max}(\{\rho^n\}\|\{\sigma^n\}) + c. \]

Since this construction is possible for any $c > 0$, we have '$\le$'.
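The construction of $\Phi^n(\delta_1)$ in the first half of the proof can be made concrete in a single-copy analogue. The sketch below is a simplification under stated assumptions: it takes $n = 1$, sets $\Phi(\delta_0) = \rho$ exactly (no approximation sequence), and uses the smallest $a$ with $\rho \le e^{a}\sigma$, namely the logarithm of the largest eigenvalue of $\sigma^{-1/2}\rho\,\sigma^{-1/2}$, in place of $D_{\max}$ of the sequence. The helper names, the random states and the slack value `c` are hypothetical.

```python
import numpy as np

def herm_fun(a, f):
    """Apply a scalar function f to a Hermitian matrix via its eigen-decomposition."""
    w, v = np.linalg.eigh(a)
    return (v * f(w)) @ v.conj().T

def random_state(d, rng):
    """Random full-rank density matrix (illustrative sampling)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T + 0.1 * np.eye(d)
    return rho / np.trace(rho).real

rng = np.random.default_rng(5)
rho, sigma = random_state(3, rng), random_state(3, rng)

# Smallest a with rho <= e^a sigma: a = ln lambda_max(sigma^{-1/2} rho sigma^{-1/2}).
sig_inv_half = herm_fun(sigma, lambda w: 1 / np.sqrt(w))
T = sig_inv_half @ rho @ sig_inv_half
a = np.log(np.linalg.eigvalsh((T + T.conj().T) / 2).max())

c = 0.1                                   # slack, playing the role of c > 0 in the proof
q0 = np.exp(-(a + c))                     # q(0); q(1) = 1 - q(0)
phi0 = rho                                # Phi(delta_0)
phi1 = (sigma - q0 * phi0) / (1 - q0)     # Phi(delta_1), fixed by Phi(q) = sigma
phi1 = (phi1 + phi1.conj().T) / 2         # remove rounding asymmetry
```

Because $\sigma - q(0)\rho \ge (1-e^{-c})\sigma > 0$, the operator `phi1` is a valid state, so the binary-input CQ map is well defined.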
Second, we prove '$\ge$'. Observe that, due to (12),

\[ \limsup_{n\to\infty} -\frac1n\ln q^n(0) = D_{\max}(\{p^n\}\|\{q^n\}). \]

Therefore, by the monotonicity (13) of $D_{\max}$,

\[ \limsup_{n\to\infty} -\frac1n\ln q^n(0) \ge D_{\max}(\{\Phi^n(p^n)\}\|\{\Phi^n(q^n)\}) = D_{\max}(\{\Phi^n(p^n)\}\|\{\sigma^n\}). \]

This, by the asymptotic continuity (14) and (12), leads to '$\ge$'. ■

A converse statement of Stein's lemma can be made using $D_{\max}$, indicating that the asymptotic reverse test is a
natural inverse problem of the test.

Corollary 5.3 (converse statement of Stein's lemma)

\[ D_{\max}(\{\rho^n\}\|\{\sigma^n\}) \ge \sup\left\{\lim_{n\to\infty} -\frac1n\log\mathrm{tr}\,P_n\sigma^n;\ \lim_{n\to\infty}\mathrm{tr}\,\rho^nP_n > 0,\ 0 \le P_n \le 1\right\}. \]

Proof. Consider $(\Phi^n,\{p^n,q^n\})$ with (12), let $P_n$ be a POVM element with $\lim_{n\to\infty}\mathrm{tr}\,\rho^nP_n > 0$, and let $\tilde p^n$ and $\tilde q^n$ be
binary distributions with

\[ \tilde p^n(0) := \mathrm{tr}\,P_n\Phi^n(p^n), \quad \tilde q^n(0) := \mathrm{tr}\,P_n\Phi^n(q^n). \]

Then,

\[ \lim_{n\to\infty}\tilde p^n(0) = \lim_{n\to\infty}\mathrm{tr}\,P_n\Phi^n(p^n) \ge \lim_{n\to\infty}\left\{\mathrm{tr}\,P_n\rho^n - \|\rho^n-\Phi^n(p^n)\|_1\right\} > 0. \qquad (15) \]

Observe that the composition of the map $\Phi^n$ and the measurement $\{P_n, 1-P_n\}$ is a CPTP map, or a
Markov map from binary distributions onto themselves. Hence, it should be written as

\[ \tilde p^n(0) = a^n_{00}p^n(0) + a^n_{01}p^n(1), \]
\[ \tilde q^n(0) = a^n_{00}q^n(0) + a^n_{01}q^n(1). \]

By (15), we should have $\lim_{n\to\infty}a^n_{00} > 0$ or $\lim_{n\to\infty}a^n_{01} > 0$. If the former is true,

\[ \lim_{n\to\infty} -\frac1n\log q^n(0) \ge \lim_{n\to\infty} -\frac1n\log\tilde q^n(0) - \lim_{n\to\infty} -\frac1n\log a^n_{00} = \lim_{n\to\infty} -\frac1n\log\tilde q^n(0). \]

If the latter is true, due to $\lim_{n\to\infty}q^n(1) = 1$, we have

\[ \lim_{n\to\infty} -\frac1n\log\tilde q^n(0) = 0. \]

In either case, we have

\[ \lim_{n\to\infty} -\frac1n\log q^n(0) \ge \lim_{n\to\infty} -\frac1n\log\tilde q^n(0) = \lim_{n\to\infty} -\frac1n\log\mathrm{tr}\,P_n\sigma^n. \]

Also, Theorem 5.2 implies that there is $(\Phi^n,\{p^n,q^n\})$ with

\[ \lim_{n\to\infty} -\frac1n\log q^n(0) \le D_{\max}(\{\rho^n\}\|\{\sigma^n\}) + c \]

for any $c > 0$. Therefore, we have the converse statement. ■
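5.2 Relations between $D_{\max}$, $\overline{D}$ and $D$

Nagaoka [15] defined the following quantity to analyze quantum hypothesis tests:

\[ \overline{D}(\{\rho^n\}\|\{\sigma^n\}) := \inf\left\{a;\ \lim_{n\to\infty}\mathrm{tr}\,\rho^n\{\rho^n - e^{na}\sigma^n \le 0\} = 1\right\}, \]

where $\{\rho^n - e^{na}\sigma^n \le 0\}$ is the projector onto the non-positive eigenspace of $\rho^n - e^{na}\sigma^n$.

Theorem 5.4 [15][16] $\overline{D}(\{\rho^n\}\|\{\sigma^n\})$ characterizes the efficiency of the test '$\rho^n$ vs. $\sigma^n$' as follows.

\[ \overline{D}(\{\rho^n\}\|\{\sigma^n\}) = \inf\left\{a;\ \forall\{P_n\}:\ \lim_{n\to\infty} -\frac1n\ln\mathrm{tr}\,P_n\sigma^n \ge a \implies \lim_{n\to\infty}\mathrm{tr}\,P_n\rho^n = 0\right\}. \]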

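Lemma 5.5 (Datta [3]) If

\[ \varepsilon := 1 - \mathrm{tr}\,\rho^n\{\rho^n - e^{na}\sigma^n \le 0\}, \]

there is a positive operator $A_n$ with

\[ \|A_n-\rho^n\|_1 \le \sqrt{8\varepsilon}, \quad A_n \le e^{na}\sigma^n. \]

Let $A_n$ be as in Lemma 5.5 and $\tilde\rho^n := \frac{1}{\mathrm{tr}\,A_n}A_n$. Then,

\[ \begin{aligned}
\|\tilde\rho^n-\rho^n\|_1 &\le \left\|\frac{A_n}{\mathrm{tr}\,A_n} - A_n\right\|_1 + \|A_n-\rho^n\|_1 \\
&= |1-\mathrm{tr}\,A_n| + \|A_n-\rho^n\|_1 \\
&= |\mathrm{tr}\,\rho^n-\mathrm{tr}\,A_n| + \|A_n-\rho^n\|_1 \\
&\le 2\|A_n-\rho^n\|_1 \\
&\le 4\sqrt{2\varepsilon}. \qquad (16)
\end{aligned} \]

Also,

\[ \tilde\rho^n \le \frac{e^{na}}{\mathrm{tr}\,A_n}\sigma^n \le \frac{e^{na}}{1-\sqrt{8\varepsilon}}\sigma^n. \qquad (17) \]

Theorem 5.6

\[ \overline{D}(\{\rho^n\}\|\{\sigma^n\}) = D_{\max}(\{\rho^n\}\|\{\sigma^n\}). \]

The proof is analogous to the proof of Theorem 2 of [3], and is given below with minor modifications
reflecting the difference between their $D_{\max}$ and ours.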
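Proof. First we show '$\le$'. Let $c > 0$ and let $\{\tilde\rho^n\}$ be a sequence of states with

\[ \tilde\rho^n \le e^{n\{D_{\max}(\{\rho^n\}\|\{\sigma^n\})+c\}}\sigma^n, \quad \lim_{n\to\infty}\|\tilde\rho^n-\rho^n\|_1 = 0. \]

Then, for any $\{P_n\}$ with

\[ \lim_{n\to\infty} -\frac1n\ln\mathrm{tr}\,P_n\sigma^n \ge D_{\max}(\{\rho^n\}\|\{\sigma^n\}) + 2c, \]

we have

\[ \lim_{n\to\infty}\mathrm{tr}\,P_n\rho^n = \lim_{n\to\infty}\mathrm{tr}\,P_n\tilde\rho^n \le \lim_{n\to\infty}e^{n\{D_{\max}(\{\rho^n\}\|\{\sigma^n\})+c\}}\,\mathrm{tr}\,P_n\sigma^n = 0. \]

Since $c > 0$ is arbitrary, by Theorem 5.4 we have '$\le$'.

Second, we show '$\ge$'. Let $a = \overline{D}(\{\rho^n\}\|\{\sigma^n\}) + c$ ($c > 0$). Then,

\[ \lim_{n\to\infty}\mathrm{tr}\,\rho^n\{\rho^n - e^{na}\sigma^n \le 0\} = 1. \]

Hence, by (16) and (17), we can compose $\tilde\rho^n$ with

\[ \lim_{n\to\infty}\|\tilde\rho^n-\rho^n\|_1 = 0, \quad \tilde\rho^n \le e^{n(a+c')}\sigma^n, \quad \forall c' > 0\ \exists n_0\ \forall n \ge n_0, \]

or

\[ \overline{D}(\{\rho^n\}\|\{\sigma^n\}) + c + c' \ge D_{\max}(\{\rho^n\}\|\{\sigma^n\}). \]

Since $c, c' > 0$ are arbitrary, we have the assertion. ■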

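Theorem 5.7

\[ D(\rho\|\sigma) = \overline{D}(\{\rho^{\otimes n}\}\|\{\sigma^{\otimes n}\}). \]

Corollary 5.8

\[ \overline{D}(\{\rho^{\otimes n}\}\|\{\sigma^{\otimes n}\}) = D_{\max}(\{\rho^{\otimes n}\}\|\{\sigma^{\otimes n}\}) = D(\rho\|\sigma). \]

5.3 Asymptotically lower continuous and monotone relative entropy

Theorem 2.7 If $D(\rho_0\|\sigma_0) > D(\rho\|\sigma)$, there is a sequence $\{\Psi^n\}$ of TPCP maps with

\[ \lim_{n\to\infty}\|\Psi^n(\rho_0^{\otimes n}) - \rho^{\otimes n}\|_1 = 0, \quad \Psi^n(\sigma_0^{\otimes n}) = \sigma^{\otimes n}. \qquad (18) \]

Conversely, if $\{\Psi^n\}$ with (18) exists, $D(\rho_0\|\sigma_0) \ge D(\rho\|\sigma)$.

Proof. Suppose $D(\rho_0\|\sigma_0) > D(\rho\|\sigma)$ and let

\[ c := \frac12\left\{D(\rho_0\|\sigma_0) - D(\rho\|\sigma)\right\}. \]

By Theorem 2.5 there is a sequence of projectors $\{P_n\}$ with

\[ p^n(0) := \mathrm{tr}\,P_n\rho_0^{\otimes n} \to 1 \quad (n\to\infty), \]
\[ q^n(0) := \mathrm{tr}\,P_n\sigma_0^{\otimes n} \le e^{-n(D(\rho_0\|\sigma_0)-c)} = e^{-n(D(\rho\|\sigma)+c)}, \quad \exists n_0\ \forall n \ge n_0. \]

Let the CPTP map $\Phi^n$ be as in the proof of Theorem 5.2. Then, due to $D_{\max}(\{\rho^{\otimes n}\}\|\{\sigma^{\otimes n}\}) = D(\rho\|\sigma)$, the composition $\Psi^n$ of the
measurement $\{P_n, 1-P_n\}$ followed by $\Phi^n$ satisfies (18). Thus we have the former half of the assertion.

In the sequel, we prove the latter half. Recall that $D(\rho\|\sigma)$ satisfies (M), (A), and (C). Therefore, writing $\rho^n := \Psi^n(\rho_0^{\otimes n})$,

\[ \begin{aligned}
D(\rho_0\|\sigma_0) &= \lim_{n\to\infty}\frac1n D(\rho_0^{\otimes n}\|\sigma_0^{\otimes n}) \\
&\overset{(M)}{\ge} \lim_{n\to\infty}\frac1n D(\Psi^n(\rho_0^{\otimes n})\|\Psi^n(\sigma_0^{\otimes n})) \\
&= \lim_{n\to\infty}\frac1n D(\rho^n\|\sigma^{\otimes n}) \overset{(C)}{\ge} \lim_{n\to\infty}\frac1n D(\rho^{\otimes n}\|\sigma^{\otimes n}) \overset{(A)}{=} D(\rho\|\sigma).
\end{aligned} \]

■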

Corollary 5.9 $D^F(\rho\|\sigma) := -\ln\|\sqrt{\rho}\sqrt{\sigma}\|_1$ does not satisfy the condition (C).

Proof. $D^F(\rho\|\sigma)$ satisfies (M) and (A), but does not equal a constant multiple of $D(\rho\|\sigma)$. Therefore, by Theorem 2.3, we must
have the assertion. ■



Theorem 5.10 If $g$ is a properly normalized monotone metric, then

\[ \inf\left\{\lim_{n\to\infty}\frac1n D^g(\tilde\rho^n\|\sigma^{\otimes n});\ \lim_{n\to\infty}\|\tilde\rho^n-\rho^{\otimes n}\|_1 = 0\right\} = D(\rho\|\sigma). \]

Proof. By Theorem 2.2, we have to prove the assertion only for $g = J^R$. '$\ge$' is due to $D^R \ge D$ and
Proposition 2.6. To prove '$\le$', let $(\Phi^n,\{p^n,q^n\})$ be an asymptotic reverse test with

\[ \lim_{n\to\infty} -\frac1n\ln q^n(0) = D(\rho\|\sigma) + c. \]

Then by the monotonicity of $D^R$,

\[ \begin{aligned}
\lim_{n\to\infty}\frac1n D^R(\Phi^n(p^n)\|\sigma^{\otimes n}) &\le \lim_{n\to\infty}\frac1n D^R(p^n\|q^n) \\
&= \lim_{n\to\infty}\frac1n D(p^n\|q^n) \\
&= \lim_{n\to\infty} -\frac1n\ln q^n(0) = D(\rho\|\sigma) + c,
\end{aligned} \]

which leads to the assertion. ■

6 Conclusions and Discussions



Using the reverse test and the asymptotic reverse test, we gave a characterization of quantum versions of relative
entropy. Note that the uniqueness in the asymptotic scenario is valid also for classical relative entropy: any
two-point function over probability distributions with (A), (M) and (C) is a constant multiple of relative entropy.

The condition (C) can be replaced by the following 'weak monotonicity' [10], which may be a bit more
natural.

(WM) (weak monotonicity) If $\|\rho^{\otimes n} - \Lambda^n(\rho_0^{\otimes n})\|_1 \to 0$ and $\sigma^{\otimes n} = \Lambda^n(\sigma_0^{\otimes n})$, then

\[ D^Q(\rho_0\|\sigma_0) \ge D^Q(\rho\|\sigma). \]

It may be interesting to compare the asymptotic behavior of quantum relative entropies and the corresponding
quantum Fisher informations (the correspondence is made via Lemma 4.2). While it is known that $J^R$ and $J^S$
satisfy both of them [12], $D^R(\rho\|\sigma)$ and $D^{J^S}(\rho\|\sigma)$ do not satisfy (C) and (A), respectively.

Some problems are left open. First, relaxing (C) in the following manner can be interesting:

(C') (Lower exponential asymptotic continuity) If $\lim_{n\to\infty} -\frac1n\ln\|\rho^n-\rho^{\otimes n}\|_1 = a > 0$, then

\[ \lim_{n\to\infty}\frac1n\left\{D^Q(\rho^n\|\sigma^{\otimes n}) - D^Q(\rho^{\otimes n}\|\sigma^{\otimes n})\right\} \ge 0. \]

By relaxing (C) to (C'), quantities such as the relative Renyi entropy may survive. Second, generalizing
Theorem 5.2 and Theorem 2.7 (by increasing the number of states, changing the constraint on the error, etc.) may also be
interesting.

7 Reverse estimation for a multi-dimensional parameter family

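By an argument parallel to the one-dimensional case, we have

\[ J^R_{\rho_\theta}(X,X) = \min\{J_{p_\theta}(Y,Y);\ \Phi(p_\theta) = \rho_\theta,\ \Phi(Y) = X\}. \]

Therefore, for any reverse estimation $(\Phi,\{p_\theta\})$,

\[ J_\theta \ge J^R_\theta, \]

which, for any real $m\times m$ matrix $G \ge 0$, leads to

\[ \mathrm{Tr}\,GJ_\theta \ge \min\{\mathrm{Tr}\,GJ;\ J \ge J^R_\theta\} = \mathrm{Tr}\,G\,\Re J^R_\theta + \mathrm{Tr\,abs}\,G\,\Im J^R_\theta. \qquad (19) \]

Note that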



1 



-Trpe [Lf ti , Lf tj ] := J e , 



2 



ij ■ 



This inequality is in many cases not achievable. However, if $\{\rho_\theta\}$ is RLD-parallel, $\Im J^R_{\theta,i,j} = 0$ and the inequality
is written as

\[ \mathrm{Tr}\,GJ_\theta \ge \mathrm{Tr}\,G\,\Re J^R_\theta, \]

which is achievable. Also:

Example 7.1 Gaussian states are defined by their P-representation,

\[ \rho_\theta = \int\frac{dp\,dq}{2\pi N}\exp\left\{-\frac{(q-\theta^1)^2 + (p-\theta^2)^2}{2N}\right\}\left|\frac{q+ip}{\sqrt2}\right\rangle\left\langle\frac{q+ip}{\sqrt2}\right|, \]

where $|z\rangle$ is the coherent state with complex amplitude $z$. Being infinite dimensional states, in the strict sense this
example is out of the scope of our theory. However, the lower bound (19) can be explicitly computed as

\[ \mathrm{Tr}\,\Re J^R_\theta + \mathrm{Tr\,abs}\,\Im J^R_\theta = \frac{2}{N}. \]

Also, using the P-representation, one can compose a reverse estimation (generate coherent states according to the
probability distribution defined by the P-function), and

\[ J_\theta = \begin{pmatrix} \frac1N & 0 \\ 0 & \frac1N \end{pmatrix}, \quad \mathrm{Tr}\,J_\theta = \frac{2}{N}, \]

achieving the lower bound.

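Note that the optimal measurement for estimation of $\theta = (\theta^1,\theta^2)$ is the POVM built from the coherent states
$\left|\frac{q+ip}{\sqrt2}\right\rangle\left\langle\frac{q+ip}{\sqrt2}\right|$, and the distribution of the estimate of $\theta = (\theta^1,\theta^2)$ is equal to the Q-representation. Moreover, let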



E 



N 



N + 1 



U e \n) ® U e \n) G U®K, 



where Ug is the Weyl operator. Then pg — tr*; \tpg) (<p$\ = tr% \fe) (fsl an d 



JV + 1 



4 iV + 1 



Therefore, if one measures JC-part of\(pg) by POVM j - ^ / - j, one obtains 
^^exp| — — 9 - 1 — or realizes i/ie optimal reverse estimation. 



V2 



im'i/l probability 